Using Randomized Quantile Residuals to Assess Model Fit of Generalized Linear Mixed Models
Julia Piaskowski
May 14, 2025
Why?
Randomized quantile residual was proposed in the literature [in 1996] to circumvent the…problems in traditional residuals [for generalized linear models]. However, this approach has not gained deserved awareness and attention, partly due to the lack of extensive empirical studies to investigate its performance.
–Feng et al, 2017
Deep dive into quantile residuals for model evaluation and the most popular implementation of them, DHARMa.
State of DHARMa
DHARMa was first released in 2016. CRAN downloads were sparse until 2019, and began steadily increasing. It averages approximately ~500 downloads per day (source: https://hadley.shinyapps.io/cran-downloads/).
It has reverse dependencies with 18 R packages including glmmTMB, performance, easystats.
DHARMa does not have a published peer-reviewed manuscript supporting its release.
Randomization is used to produce continuously distributed residuals when the response is discrete or has a discrete component. This means that the quantile residuals will vary from one realization to another for a given data set and fitted model.
–Smythe & Dunn (1996)
DHARMa
(Diagnostics for HierArchical Regression Models)
The DHARMa Process
Model a process using a generalized linear model with a given distribution and link function.
Simulate new observations (\(n_{sim} = n_{data}\)) using fitted values as the model distributional parameters (e.g. shape, scale) as appropriate. \[ Y_i \sim \mathcal{P}(\hat{\mu_i}, \hat{\phi}) \]
Calculate the quantile for the cumulative distribution function
\[ r_{i,q} = F(y_i; \hat{\mu_i}, \hat{\phi} )\]
Count variables: add a random value \(\sim \mathcal{U}(a, b)\)
\(a =\) empirical probability simulated values are less than observed
\(b =\) empirical probability simulated values are less than or equal to observed
DHARMa runs 250 simulations per observation by default, they recommend up to 1000
Poisson Example
\[ (\hat{Y_i} = \lambda = 5,; \quad Y_i = 7 )\]
Quantile Residuals for 1 Observation
Poisson GLMM Example
Create a data set with a random effect (“group”, 10 levels), a covariate (“Environment1”) and a Poisson-distributed response variable (“observedResponse”):
testData =createData(sampleSize =500)
Analyze correctly and incorrectly:
rightModel <-glmer(observedResponse ~ Environment1 + (1|group) , family ="poisson", data = testData)wrongModel <-lmer(observedResponse ~ Environment1 + (1|group) , data = testData)
Kolmogorov-Smirnov test for uniformity (ks.test()) against a uniform distribution, (0, 1)
Dispersion test: compares the variance in observations to the variance of the simulations; bootstrap approach: the p-val is the frequency a data set of observed value exceeds or is below an expectation (two-sided tests both)
Outliers and outlier test: outliers are residual values of 0 or 1. Outlier test uses the binomial test (binom.test(), evaluating if the number of outliers is appropriate for the data set size (expectation: n = 1/(nSim +1).
Quantile tests: fits splines (qgam::qgam()) at 0.25, 0.5 and 0.75 quantiles and tests for deviations from the expectation of a flat line, using a Benjamini-Hochberg adjustment.
Quantile Residuals
Incorrectly Specified Model
Distributions of Quantile Residuals
Quantile Function of the Standard Normal Distribution
Notes on DHARMa Implementation
It depends on the simulation functions build into GLMM package
Feng et al (2017) demonstrated for many scenarios (misspecified models for gamma, neg bin, Poisson and zero-inflated models) that quantile residuals are a better diagnostic tool for checking model distribution, model parameters, over/under dispersion, and overall lack of fit of GLMMs than ‘standard’ residuals.
Fully conditional simulations are recommended for an omnibus diagnostic; if the model is correct, this should no matter, but the conditional simulation is more sensitive.
We do not know their overall sensitivity to modest model misspecifications.
Results from quantiles tests for uniformity should be treated with caution - no pattern proves a model is correct; likewise, non-random patterns are not always a deal-breaker.
Outliers identified are warnings of possible outliers and lack quantitative information.
Final Thoughts
There is an overall lack of information on using quantile residuals in a mixed model context and the application of conditional versus marginal quantile residuals.
RTFM: read the DHARMa vignette and when needed, function documentation. Not all all R code documentation is well written and helpful, but this package is very helpful.
The quantile residuals described herein do not apply to multinomial models. An extension of quantile residuals for these models has been developed, but it lacks an easy implememtation.
Sources
Bates DM, Maechler M, Bolker BM and S Walker (2015). “Fitting Linear Mixed-Effects Models Using lme4.” Journal of Statistical Software, 67(1), 1-48. doi:10.18637/jss.v067.i01.
Brooks ME, Kristensen K, van Benthem KJ, Magnusson A, Berg CW, Nielsen A, Skaug HJ, Maechler M and BM Bolker (2017). “glmmTMB Balances Speed and Flexibility Among Packages for Zero-inflated Generalized Linear Mixed Modeling.” The R Journal, 9(2), 378-400. doi: 10.32614/RJ-2017-066.
Dunn KP, and GK Smyth (1996). Randomized quantile residuals. Journal of Computational and Graphical Statistics 5, 1-10.
Feng et al (2017). Randomized quantile residuals: an omnibus model diagnostic tool with unified reference distribution. arXivhttps://doi.org/10.48550/arXiv.1708.08527
Gelman A and J Hill (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press, New York.
Sources
Gerber EAE and BA Craig (2024) Residuals and diagnostics for multinomial regression models. Statistical Analysis and Data Mining: An ASA Data Science Journal 17:e11645. https://doi.org/10.1002/sam.11645
Hartig F (2022). DHARMa: Residual Diagnostics for Hierarchical (Multi-Level/Mixed) Regression Models. R package version 0.4.6, https://CRAN.R-project.org/package=DHARMa.
Liu D, Zhang H (2018) Residuals and Diagnostics for Ordinal Regression Models: A Surrogate Approach. J Am Stat Assoc 113:845–854. https://doi.org/10.1080/01621459.2017.1292915
Pinheiro JC and DM Bates (2000). Mixed-Effects Models in S and S-PLUS. Springer Verlag, New York.
Stroup WW, Ptukhina M and J Garai (2024). Generalize Linear Mixed Models: Modern Concepts, Methods and Applications. CRC Press, Boca Raton.
(G)LMMs are hard - harder than you may think based on what you may have learned in your second statistics class….